Social media affects society positively, negatively, or both. It has become a key component of
our lives and a vital news resource. The COVID-19 crisis has affected the lives of all people, and
the spread of misinformation causes confusion among individuals. Automated methods are therefore
vital for detecting false claims and preventing the spread of misinformation. COVID-19 news can be
classified into two categories: false or real. This paper provides an automated misinformation
checking system for COVID-19 news. Five machine learning algorithms and several deep learning
models are evaluated. The proposed system uses bidirectional encoder representations from
transformers (BERT) with deep learning models. Detecting fake news with BERT is a fine-tuning
task. BERT achieved an accuracy of 98.83% as a pre-trained model and classifier on the COVID-19
dataset. Better results are obtained using BERT with deep learning models, which achieved an
accuracy of 99.1%. These results improve on previous work in fake news detection. As a further
contribution, the proposed system allows users to check the credibility of claims: it finds the
real news from experts most related to a fake claim and answers questions about COVID-19 using the
universal-sentence-encoder model.
1. INTRODUCTION

The idea of fake news existed before the development of the Internet. However, the appearance of
social networks such as Facebook and Twitter has increased the speed at which fake news spreads,
especially since anybody can submit any news. Most sites also have a share option that allows users
to propagate website content. Social networking platforms make it possible to share content quickly
and efficiently, so users can distribute incorrect information in a short period. Consumer reviews
are likewise part of everyday life: users read reviews before a purchase, or store them to find the
best product by comparing product reviews [1]. Detecting fake reviews is therefore also important
for commercial reasons. Facebook and other companies have pledged to do more to stop false news
from spreading.

Fake news is one of many sorts of misinformation, including rumors and misleading web material. 
False news is described as "news pieces meant to mislead readers but can be misrepresented by other 
sources" [2], [3]. The large-scale spread of misinformation on social media has become a global
problem, and its effects on public opinion and socio-political development are a serious threat.
The Internet is currently more convenient than traditional media for users to acquire news.
In recent years, the detection of false information has been a significant subject of research.
Many efforts have been made to overcome the problems of false information detection and to identify
promising approaches. False news consists of intentionally and verifiably false news pieces that
may mislead the reader. In the middle of the COVID-19 crisis, fraudulent purveyors of COVID-19
products have caused many people to suffer, and public health groups are making significant efforts
on social media sites to educate the public and correct misinformation. The World Health
Organization (WHO) declared a "major infodemic" due to coronavirus transmission, and the worldwide
COVID-19 epidemic has become an international war against misinformation [4]. Fake news spreads
like an illness during critical times in a country, causing unrest, havoc, and disruption among
citizens. The use of machine learning to counteract fake news on social media platforms has
increased dramatically. However, because these algorithms are unaware of context, they have had
only limited effectiveness in detecting spam and harmful information [5].

Two primary methodologies are available to detect credibility: machine learning and deep
learning. Perez-Rosas et al. [6] used crowdsourcing, in which workers were tasked with creating a
fraudulent version of news reports. The rumor identification framework is a machine learning
strategy that legitimizes signals of ambiguous messages so that a person can readily recognize bogus
news, and people are notified if posts appear to be false [7]. The Twitter crawler, a machine
learning approach, collects tweets and adds them to a database, comparing distinct tweets [8]. The
Twitter credibility issue was first studied systematically by Castillo et al. [9], [10]. The
authors identified several language indicators of disappointment and used Random Forest (RF) to
detect false information. For detecting bogus news [11], [12], [13], Briscoe et al. presented
several recurrent neural network (RNN) designs, such as the gated recurrent unit (GRU), Tanh-RNN,
and long short-term memory (LSTM), to obtain the best possible performance [14].

The problem addressed here has two parts. The first is how to determine whether the content of a
news item is true or fake so that it does not mislead readers, and which algorithm, with which
feature extraction method, is most accurate. The second is how the reader can check correct news
from experts about a specific rumor or find the answer to a specific question related to COVID-19.
Our approach can help improve traditional fake news detection strategies, in which content features
are frequently used to identify fake news.

The contributions of this paper can be summarized as follows: i) An automated misinformation
checking system for COVID-19 news on social media platforms that uses several approaches;
ii) More precise identification of fake news using a pre-trained model combined with deep
learning; iii) After detecting the credibility of a claim, the system finds the most related real
news and the appropriate answer to any question about COVID-19.

The remainder of this paper is organized as follows: the research method is presented in
section 2, the results and discussion are given in section 3, and the conclusion is presented in
section 4.


2. RESEARCH METHOD 

An automated COVID-19 misinformation checking system is presented. The proposed system uses
three main approaches, as shown in Figure 1. The first approach detects fake news using five
baseline traditional machine learning techniques: naive Bayes (NB), decision tree (DT), logistic
regression (LR), support vector machine (SVM), and random forest (RF). The second approach detects
fake news using deep neural networks: a convolutional neural network (CNN), the LSTM (one to two
layers), and the GRU (one to two layers). The third approach uses the transformer model BERT, which
is a pre-trained model. Instead of training a transformer model from scratch, it is probably more
efficient to use (and eventually fine-tune) a pre-trained model (BERT, XLNet, DistilBERT) from the
transformers package. The open-source machine learning frameworks PyTorch and TensorFlow are used.

The BERT model is among the most used models for text classification tasks. The original
English-language BERT model has two pre-trained general types. The BERT-Base model is a 12-layer,
768-hidden, 12-head neural network architecture with 110M parameters. The BERT-Large model has 24
encoders with 16 bidirectional self-attention heads. Both models are trained on the BooksCorpus
with 800M words [15]. The BERT-Base model is used because the dataset in this work is not
extensive [16].

BERT-Large has 16 attention heads, 340 million parameters, and a hidden size of 1024, and it
improves on the performance of BERT-Base. A simple softmax classifier is added on top to classify
the representation. We can use a pre-existing model trained on a vast dataset and tune it [17] to
perform other tasks on different datasets. Pre-trained models have two main advantages:
i) Training costs for a new deep learning model are reduced; ii) The datasets they are trained on
fulfill industry-accepted norms, and hence the quality of the pre-trained models has already been
examined.


Automated COVID-19 misinformation checking system using ... (Marina Azer) 


ISSN: 2252-8938


[Figure 1: flowchart. Stages: pre-processing and feature extraction; data splitting into training
(90%) and testing (10%); three model families: regular ML models, deep learning models (CNN, LSTM),
and a fine-tuned pre-trained model (BERT); performance measures: accuracy, precision, recall,
F-score.]

Figure 1. The proposed system for fake news detection on COVID-19 data


The second phase of the model is shown in Figure 2. The user enters a claim to check whether it
is real or fake. The proposed system detects the credibility of the sentence using BERT as a
pre-trained model with the LSTM deep learning model, which achieved the best accuracy among all
models. If the claim is real, the system reports that it is a real claim. Otherwise, it reports a
fake claim and shows the five real news items from experts' statements that are most relevant to
the claim. Relevance is measured with cosine similarity: each text is represented as a vector, and
the similarity between two texts is the cosine of the angle between their vectors. Cosine
similarity is thus a statistic that indicates how similar two phrases are, regardless of their
size. The other methodology uses the universal sentence encoder, which encodes text into
high-dimensional vectors that can be used for text classification, semantic similarity, and
clustering.
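The retrieval step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the toy three-dimensional vectors are assumptions, and in the real system the vectors would come from a sentence encoder.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

def top_k_related(claim_vec, news_vecs, k=5):
    """Return (index, score) pairs for the k news vectors most similar to the claim."""
    scored = [(i, cosine_similarity(claim_vec, v)) for i, v in enumerate(news_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy embeddings (hypothetical); real sentence embeddings have hundreds of dimensions.
claim = [1.0, 0.0, 1.0]
news = [[2.0, 0.0, 2.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
ranking = top_k_related(claim, news, k=2)
```

The first news vector is a scaled copy of the claim vector and therefore ranks first with a score of about 1, which illustrates that the measure ignores vector length.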

The TensorFlow universal-sentence-encoder is publicly available as a pre-trained Universal
Sentence Encoder model. The model is trained and optimized for greater-than-word-length text, such
as sentences, phrases, or short paragraphs, and it is trained with a deep averaging network (DAN)
encoder on varied data.


2.1. Dataset 

Insurance claims, which come from numerous sources, are typically the raw data for healthcare
fraud detection. In addition to insurance claim data, additional types of data utilised in the
identification of healthcare fraud include information about doctors, information about
prescriptions written by doctors, information about the medication or substances prescribed, and
information about bills and transactions. Data on the healthcare system in each nation is distinct.
Therefore, work in fraud detection is assessed while taking into account an understanding of the
official health data. The U.S. Health Care Financing Administration (HCFA) is a significant
government agency in charge of health care. In the US, there are two healthcare programmes:
Medicare and Medicaid. Most often, researchers use Medicare or Medicaid data, which includes
information on medications and pharmaceuticals, bills and transactions, and medical providers, to
find fraud and misuse in the healthcare systems [18]. This work uses a healthcare misinformation
dataset built from fake news on blogs and social media, along with the impact that individuals have
on such fake news. The data contains 4,251 news mentions, 296,000 user engagements, 926 tweets
referencing COVID-19, and ground truth labels.
Figure 2. The second phase of the proposed model for fake news detection on COVID-19 data and
answer related questions


2.2. Pre-processing 

Pre-processing is a critical step that significantly affects detection effectiveness. For
example, Twitter data is a popular, unstructured source of information collected from individuals
who post their feelings, opinions, attitudes, and product reviews. The basic cleaning steps in this
work are:

— Lowercasing: one of the fundamental cleaning procedures is to convert every word to lowercase.

— Removal of symbols and irrelevant characters: punctuation such as commas, quotation marks,
question marks, and other characters that add little value to the model is deleted.

— Deletion of URLs: removal of any links in the text.

— Tokenization: division of the text into tokens (words).

— Deletion of stop-words: stop-words contribute little meaning to the sentence and are deleted;
any misspellings are then fixed automatically.

— Lemmatization: lemmatization is used instead of stemming because stemming just cuts off the
end or the beginning of the word, taking into account a list of common prefixes and suffixes that
can be found in an inflected word.

This indiscriminate cutting can be successful on some occasions, but not always, whereas
lemmatization takes the morphological analysis of the words into consideration. It requires
detailed dictionaries that the algorithm can look through to link a word form back to its lemma.
Therefore, lemmatization achieves better results than stemming with the standard machine learning
algorithms, as shown in Table 1 and Table 2.
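The first cleaning steps listed above can be sketched with the Python standard library. This is a simplified illustration under stated assumptions: the stop-word list is a tiny hypothetical sample, URLs are removed before punctuation so that links are still recognizable, and spelling correction and lemmatization are omitted because they need external dictionaries.

```python
import re
import string

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "that"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def clean_text(text):
    """Lowercase, strip URLs and punctuation, tokenize, and drop stop-words."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)   # remove links before punctuation is stripped
    text = text.translate(PUNCT_TABLE)  # delete ASCII punctuation
    tokens = text.split()               # whitespace tokenization
    return [tok for tok in tokens if tok not in STOP_WORDS]

tokens = clean_text("The vaccine is SAFE, says WHO: https://example.org/report!")
```

Note that deleting punctuation also merges hyphenated terms, so a token such as "COVID-19" becomes "covid19" in this sketch.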


2.3. Feature extraction 

Term frequency-inverse document frequency (TF-IDF) with n-grams is used with the machine
learning algorithms to give a numerical weight to the textual content used for mining. TF-IDF
measures how important a term t is, based on its frequency in a document d and its relative
relevance within the entire training dataset D. As indicated in (1), the TF-IDF weight is the
product of two values: the normalized term frequency (TF) and the inverse document frequency
(IDF) [19].


$$\text{TF-IDF}(t, d) = \text{TF}(t, d) \times \text{IDF}(t, D) \tag{1}$$
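A small worked example may help make the weighting concrete. The sketch below assumes one common variant, with TF as the relative frequency of a term in a document and IDF as the logarithm of the number of documents divided by the number of documents containing the term. The source does not state which variant was used, and the helper names and toy documents are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tf_idf(docs, n=1):
    """Return one {term: weight} dict per tokenized document.

    TF(t, d)  = count of t in d / number of terms in d
    IDF(t, D) = log(N / number of documents containing t)
    (one common variant, assumed here for illustration)
    """
    term_lists = [ngrams(doc, n) for doc in docs]
    n_docs = len(term_lists)
    doc_freq = Counter(term for terms in term_lists for term in set(terms))
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append({
            t: (c / total) * math.log(n_docs / doc_freq[t])
            for t, c in counts.items()
        })
    return weights

docs = [["masks", "stop", "virus"], ["garlic", "cures", "virus"]]
unigram_weights = tf_idf(docs, n=1)
bigram_weights = tf_idf(docs, n=2)
```

In this toy corpus the term "virus" occurs in every document, so its IDF and therefore its weight is zero, while terms unique to one document receive a positive weight.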


The word embedding feature extraction method is employed with the deep learning techniques. It
turns text data (words) into vectors: each word is represented by an n-dimensional dense vector,
and similar words have similar vectors. GloVe and Word2Vec have been shown to be among the best
word embedding approaches for turning words into vectors. GloVe is an unsupervised learning
algorithm for word embeddings.

The main idea of GloVe is to learn how close two words are to each other. The pre-trained
vectors are available in several sizes: 25d, 50d, 100d, and 200d. We have used
"glove.twitter.27B.zip". We have used the Keras-tuner library [20] for hyper-parameter
optimization, selecting the optimum set of hyper-parameters for the hidden layers (LSTM and GRU)
and the dropout layers.

For the fake news classification task, the uncased BERT-Base model is employed. This model
consists of 12 attention layers with 110M parameters in total. BERT uses its built-in WordPiece
tokenizer. Furthermore, before fine-tuning, the tokenizer from the Google implementation converts
the whole text into lowercase.
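To illustrate what sub-word tokenization does, the following is a simplified sketch of greedy longest-match-first splitting, the strategy WordPiece tokenizers apply at inference time. The function name and the toy vocabulary are hypothetical, and the real BERT tokenizer additionally handles punctuation and text normalization.

```python
def wordpiece_tokenize(text, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first sub-word tokenization of lowercased text."""
    pieces = []
    for word in text.lower().split():
        start, word_pieces = 0, []
        while start < len(word):
            end = len(word)
            match = None
            while start < end:
                candidate = word[start:end]
                if start > 0:
                    candidate = "##" + candidate  # continuation pieces carry a ## prefix
                if candidate in vocab:
                    match = candidate
                    break
                end -= 1
            if match is None:
                word_pieces = [unk_token]  # no piece fits: the whole word is unknown
                break
            word_pieces.append(match)
            start = end
        pieces.extend(word_pieces)
    return pieces

# Hypothetical toy vocabulary; the real BERT vocabulary has roughly 30,000 entries.
VOCAB = {"corona", "##virus", "vaccine", "##s", "fake", "news"}
pieces = wordpiece_tokenize("Coronavirus vaccines FAKE news xyz", VOCAB)
```

Splitting rare words into known pieces lets the model handle terms such as "coronavirus" even when the full word is not in the vocabulary.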


2.4. Data splitting 

The dataset is split into 90% training and 10% testing. The training set is fed into the machine
learning and deep learning models so that they can learn from the data. The test set is used to
check the results and the performance of the algorithms.
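A 90/10 split of this kind can be sketched as below. The helper name, the fixed seed, and the use of a plain random shuffle are assumptions made for illustration, since the source does not say whether the split was stratified or seeded.

```python
import random

def split_train_test(samples, test_fraction=0.10, seed=42):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # local generator keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled samples: (text, label) pairs.
data = [(f"claim {i}", i % 2) for i in range(100)]
train_set, test_set = split_train_test(data)
```

Fixing the seed means every model is evaluated on the same held-out examples, which keeps the comparison between models fair.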


2.5. Learning models 

Models from different approaches are used here to detect fake news and to find the model with
the highest performance: regular machine learning models, deep learning models, and pre-trained
models, all of which achieve good results in NLP tasks. The models are evaluated with several
measures: accuracy, precision, recall, and F1-score. Details are discussed in the following
sections.


2.5.1. Machine learning models 

Five different machine learning models are used: Naive Bayes (NB), Logistic Regression (LR),
Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM), as shown in Tables 1
and 2. The Random Forest (RF) algorithm consistently achieves good performance compared with the
other methods. The algorithms were tested with different resampling methods.


2.5.2. Deep learning models 

Different deep learning models are used. CNN is chosen for natural language processing (NLP)
tasks because its filters slide over a few words of the sentence matrix at a time, which makes
CNNs work well for classification tasks. LSTM achieves good results because it can handle long-term
dependencies by maintaining a memory, and GRU has achieved great success in many sequential tasks,
as shown in Table 3.


2.5.3. Fine-tune a pre-trained BERT model

The use of a pre-trained model has many advantages. It lowers the carbon footprint, lowers
computation expenses, and enables the use of cutting-edge models without having to train one from
scratch. The Transformers library offers access to thousands of pre-trained models for a variety of
purposes. A pre-trained model is then trained further on a dataset particular to the task. This is
referred to as fine-tuning, a very effective training method. Fine-tuning allows BERT to model
various downstream tasks regardless of the text form: the BERT model uses a self-attention
mechanism over the word vectors given as input, which includes bidirectional cross-attention
between two sentences. Several fine-tuning approaches exist.


2.6. Model evaluation
The performance of the models is measured by the following metrics:
— Accuracy measures the proportion of correctly identified samples out of all the samples [21].


$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100 \tag{2}$$

where: TP is the number of positive samples that were correctly labeled.
TN is the number of negative samples that were correctly labeled.
FP is the number of negative samples that were mislabeled as positive.
FN is the number of positive samples that were mislabeled as negative.
— Precision and Recall: Precision measures the fraction of samples labeled positive that are
actually positive, and recall measures the model's ability to correctly recognize the occurrences
of the positive class [22], [23].
— F-Score: The harmonic mean of precision and recall [24].

$$F_1\text{-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
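The evaluation measures can be computed directly from the four confusion-matrix counts, using the standard definitions precision = TP / (TP + FP) and recall = TP / (TP + FN). The following is a minimal sketch in which the function name and the example counts are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 (all in percent) from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0  # harmonic mean
    return {
        "accuracy": 100 * accuracy,
        "precision": 100 * precision,
        "recall": 100 * recall,
        "f1": 100 * f1,
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, tn=50, fp=5, fn=5)
```

The guards against zero denominators return 0 when a class is never predicted or never present, which is a convention chosen for this sketch.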


3. RESULTS AND DISCUSSION 

This section presents the results obtained in our experiments on the COVID-19 dataset. The
regular machine learning algorithms are evaluated with two sizes of the TF-IDF feature extraction
method: unigram and bigram representations. The random forest and support vector machine
algorithms achieved the highest results, as shown in Table 1 and Table 2, with the bigram
representation and with lemmatization instead of stemming. Stemming is faster, but lemmatization
is more accurate.


Table 1. The performance of regular machine learning algorithms using stemming [25] (values in %)


N-gram    Algorithm   Accuracy   Precision   Recall   F1-score
Unigram   RF          96.57      96.22       96.57    95.47
Bigram    RF          96.63      96.41       96.63    95.52
Unigram   NB          96.17      95.64       96.17    94.55
Bigram    NB          96.24      95.68       96.24    94.72
Unigram   DT          96.22      95.19       96.23    95.35
Bigram    DT          96.14      95.08       96.18    95.26
Unigram   LR          96.28      95.4        96.28    95.01
Bigram    LR          96.38      95.73       96.38    95.17
Unigram   SVM         96.63      96.42       96.63    95.54
Bigram    SVM         96.64      96.45       96.64    95.53

Table 2. The performance of regular machine learning algorithms using lemmatization (values in %)


N-gram    Algorithm   Accuracy   Precision   Recall   F1-score
Unigram   RF          97.1       96.9        97.0     95.6
Bigram    RF          97.65      97.1        97.2     96.3
Unigram   NB          97.2       96.6        97.3     95.6
Bigram    NB          97.12      96.9        97.6     95.7
Unigram   DT          97.4       95.71       97.41    95.92
Bigram    DT          97.2       95.9        97.01    95.8
Unigram   LR          97.15      96.4        97.91    96.2
Bigram    LR          97.1       96.32       97.1     96.5
Unigram   SVM         97.4       97.7        97.4     96.9
Bigram    SVM         97.71      97.5        97.2     96.74


Deep learning models achieved better performance than regular machine learning algorithms
because of their ability to learn discriminative features through multiple hidden layers with the
word embedding feature extraction method. The deep learning models CNN, LSTM, and GRU are compared
with the pre-trained model BERT. With BERT as the encoder representation, LSTM achieved higher
results than the other deep learning models, as shown in Table 3. Using BERT as both the
pre-trained model and the classifier also achieves high accuracy (98.83%).


Table 3. The performance of deep learning models compared to pre-trained and classifier BERT (values in %)


Feature extraction method   Model   Accuracy   Precision   Recall   F1-score
GloVe                       CNN     92.6       94.2        93.2     93.6
BERT                        CNN     86.6       86.2        85.4     86.5
GloVe                       LSTM    97.92      97.3        97.4     97.3
BERT                        LSTM    99.1       99.2        99.6     99.2
GloVe                       GRU     91.7       93.2        92.5     92.4
BERT                        GRU     82.9       83.2        84.8     83.4
BERT                        BERT    98.83      98.4        99.4     99.3
In Figure 3, we compare our results with the best-known results in the literature. Fake news
detection using BERT as both a pre-trained model and a classifier achieved higher accuracy than the
other models. Using BERT as a pre-trained model together with the LSTM model improved the results
further and achieved the best results of all models on the same dataset, including the modified
LSTM and modified GRU used in [25].
Figure 3. Comparison between our models and best models


4. CONCLUSION 

Detecting bogus news is one of the most challenging tasks for a human being. This work has
presented a system to detect misleading health information. Two methodologies were applied: machine
learning and deep learning. In machine learning, lemmatization achieved better results than
stemming. The deep learning approaches outperform the machine learning methodology. The best
solution is obtained by combining deep learning models with pre-trained models, and the performance
of each model is discussed. Content characteristics are employed in this study for the binary
classification. After detecting the credibility, the user can check experts' statements about any
rumors related to COVID-19. In the future, the work will include additional information from the
official Twitter accounts of WHO, UNICEF, and ICRC, which we have collected from scratch. An
extended framework will be offered to cover data written in other languages, such as Arabic and
Spanish, and to classify data using different pre-trained models in order to identify their effect
on performance. We will also aim to use several features in the classification.